Skip to content

Conversation

@scaliby
Copy link
Member

@scaliby scaliby commented Nov 7, 2025

Description

When cluster supports placement-policy, we additionally need to append nodeSelector.label to workloads scheduled on this cluster.

Note: this adds a new label to workloads scheduled on non-nap clusters as well. Clarifying if this is okay.

Issue

b/455642310

Testing

Tested by scheduling an actual workload.

@scaliby scaliby force-pushed the workload-policy-label branch from 9ae1d4f to d28e51e Compare November 7, 2025 12:11
@scaliby scaliby marked this pull request as ready for review November 7, 2025 12:36
Copy link
Collaborator

@FIoannides FIoannides left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but please test this on a non-NAP cluster before merging



def create_placement_policy_label(system: SystemCharacteristics) -> str:
if system.accelerator_type != AcceleratorType.TPU:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why this guard? I'm surprised that placement_policy_label can be rendered as empty string if it's not supported, but also if accelerator is not TPU. If we need to add a label whenever placement policy is supported, then I'd expect create to return something whenever is_supported is true.

nodeSelector:
{accelerator_label}
{machine_label}
{placement_policy_label}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's too bad we have no test coverage for this, no unit tests cover the actual yaml content, and goldens only capture that some change happened in the template.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants